**Response Letter for CAL-2019-10-0108**

**Title:** A High-Performance Design of Generalized Pipeline Cellular Array

**Authors**: Zhufei Chu, Huiming Tian, Zeqiang Li, Yinshui Xia, and Lunyao Wang.

----------------------------------------------------------------------------------------------------------------------

We thank the editors and reviewers for conducting the peer review of our manuscript. In the revised version, we have fully addressed the reviewers’ comments. All changes with respect to the original submission are highlighted in red in the revised manuscript. The comments have helped to improve both the technical content and the presentation of our work.

----------------------------------------------------------------------------------------------------------------------

EIC: First, I apologize for the horrible delay in getting this decision to you.  Your submission was plagued by an inability to find reviewers, tardy reviewers, and a tardy associate editor.   This shouldn't happen, and I apologize.   Regarding the paper itself, the reviewers are consistent in the issues they raise.  They feel there could be a contribution in your work, but it's not clear with respect to prior work.

**Response:** Thank you very much for the fair review and the many useful comments. Following your recommendation, we have performed a major revision to address all concerns raised by the three reviewers. Specifically, several additional experiments and comparisons have been added to the revised paper, and all changes with respect to the original submission are highlighted in red in the revised manuscript.

----------------------------------------------------------------------------------------------------------------------

**Reviewer's Comments to Author:**

**Reviewer: 1**

**Recommendation:** Revise & Resubmit (paper requires more than a minor revision)

**Comments:**  
(There are no comments. Please check to see if comments were included as a file attachment with this e-mail or as an attachment in your Author Center.)  
  
**Additional Questions:**  
1. Confidence in your review: Low  
  
2. What is the important insight or idea that this paper contributes?: Comparison of generalized pipeline array using QCA.

3. What is the potential impact of this paper?: Comparison with previous results.  
  
4. Is the manuscript technically sound?  Please explain your answer in  
the "revisions" or "comments for the authors" sections below.: Appears to be – but didn’t check completely  
  
5. What revisions do you suggest for the paper?:

**Comment 1-1:** The author should clearly explain how they have come up with the design given in Figure 5. Figure 5 is not mentioned in the text at all.

**Response:** Thanks for your suggestion. Figure 5 showed the QCA layout and clock distribution of our proposed GPCA. The design method was proposed in [8] (reference [5] in the revised version), so it was not described specifically in the original manuscript. Following the comment of Reviewer 3, we have deleted Figure 5 and added more details to make the design clearer and easier to understand.

**Comment 1-2:** Equations 1, 3 and others are given in reference 9. I fail to understand why authors do not describe their papers by building on reference 9 rather than building on reference 8.

**Response:** Thanks for your kind suggestion. The Boolean expressions of GPCA were first proposed in [8] ([5] in the revised version) and are based on two-input XOR/AND/OR and single-input NOT operations. Both our design and that of [9] ([3] in the revised version) build on the expressions of [8] and are implemented in QCA, but they adopt different methods. To realize GPCA in QCA, the authors of [9] proposed a functionally equivalent MAJ/NOT-based logic network. We instead propose an XOR3/MAJ/INV-based method to implement GPCA in QCA and then compare our results with those of [9].
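To illustrate what functional equivalence between the two gate libraries means, the following is a minimal sketch and not code from the manuscript or from [9]. It checks by exhaustive enumeration that a two-input XOR built only from MAJ and NOT gates computes the same function as a single XOR3 gate with one input tied to 0. The helper names (`maj`, `inv`, `xor3`, `equivalent`) are illustrative assumptions, and the MAJ/NOT decomposition shown is a textbook one, not necessarily the network used in [9].

```python
from itertools import product


def maj(a, b, c):
    """Three-input majority: 1 when at least two inputs are 1."""
    return (a & b) | (a & c) | (b & c)


def inv(a):
    """Single-input inverter on a 0/1 value."""
    return a ^ 1


def xor3(a, b, c):
    """Three-input exclusive-OR."""
    return a ^ b ^ c


def xor2_maj_not(a, b):
    # (a AND NOT b) OR (NOT a AND b), using three MAJ gates with constant inputs.
    return maj(maj(a, inv(b), 0), maj(inv(a), b, 0), 1)


def xor2_xor3(a, b):
    # The same function with one XOR3 gate and a constant 0 input.
    return xor3(a, b, 0)


def equivalent(f, g, n_inputs):
    """Compare two n-input Boolean functions over all 2**n input patterns."""
    return all(f(*bits) == g(*bits) for bits in product((0, 1), repeat=n_inputs))


print(equivalent(xor2_maj_not, xor2_xor3, 2))
```

In this toy case the MAJ/NOT version needs three majority gates and two inverters, whereas the XOR3 version needs one gate, which is the kind of saving an XOR3 primitive can offer.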

6. Other comments for the authors:

**Comment 1-3:** The authors should start their papers by giving a review of reference 9 and then illustrate and prove how they have made modifications in reference 9.

**Response:** Thanks for your kind suggestion. We have added related work in section 2 and a review of [9] ([3] in the revised version) in section 3.2. The modifications are discussed in detail in section 4.

**Reviewer: 2**  
**Recommendation:** Revise & Resubmit (paper requires more than a minor revision)  
  
**Comments:**  
(There are no comments. Please check to see if comments were included as a file attachment with this e-mail or as an attachment in your Author Center.)  
  
**Additional Questions:**  
1. Confidence in your review: High  
  
2. What is the important insight or idea that this paper contributes?: This paper proposed a new method for efficient hardware implementation of quantum-dot cellular automata.  
  
3. What is the potential impact of this paper?: Minor improvement on the existing QCA implementation  
  
4. Is the manuscript technically sound?  Please explain your answer in  
the "revisions" or "comments for the authors" sections below.: Appears to be – but didn’t check completely  
  
5. What revisions do you suggest for the paper?:

This paper proposed a new method for efficient hardware implementation of quantum-dot cellular automata. I have the following comments regarding the paper:

**Response:** Many thanks for your appreciation of our proposed design and the positive comments.

**Comment 2-1:** Although the topic looks interesting, I am not sure how much contribution this paper adds to the state-of-the-art. In other words, the paper came up with different hardware implementation for the work proposed in reference [9].

**Response:** Thanks for your kind suggestion. Reference [9] ([3] in the revised version) first gives designs of the basic arithmetic unit and control unit and then designs of 3-, 4-, and 5-bit GPCA. The state of the art is therefore not one specific design but all the designs proposed in [9]. For a fair comparison, we first optimize the arithmetic unit and control unit and then give our own designs of 3-, 4-, and 5-bit GPCA. Section 5.2 compares them with the state of the art, and all comparison results are shown in TABLE 3. On average, the proposed design reduces the area, latency, and cell count by 39.27%, 35.89%, and 43.25%, respectively.

**Comment 2-2:** As the authors mentioned the new hardware implementation resulted in higher area efficiency and lower latency. If QCA is the way the future technology is going to go, we need more innovations to push this technology into the market.

**Response:** Thanks for your kind suggestion. First, we have added a discussion of the importance of the wire-crossing problem in the second paragraph of section 3.1, and further details are discussed in section 4. Wire-crossings may lead to many difficulties, including crosstalk and additional power dissipation. Thus, wire-crossings should be reduced as much as possible when a circuit is implemented in QCA. TABLE 1 shows that, for the AU part alone, our design reduces the number of wire-crossings from 27 in reference [9] ([3] in the revised version) to only 5, which is a substantial improvement. In addition, to make the paper more convincing, we have added another single-layer design of the AU based on the clock-zone approach in section 5.1.

**Comment 2-3:** The paper lacks architectural innovation. I believe the proposed method needs to be combined with some innovation in the hardware to show the actual benefit. The reported improvements are coming from the gate-level implementation, rather than architecture.

**Response:** Thanks for your kind suggestion. We have added more details in the second and third paragraphs of section 1. In addition, our proposed method is mainly based on combinations of majority-of-three (MAJ), inverter (NOT), and three-input exclusive-OR (XOR3) gates. However, the basic logic unit MAJ has not yet been realized at the physical level. Taking this into account, we choose QCA as the alternative implementation for our design, because the basic logic gates of QCA are MAJ and NOT, which can realize most CMOS circuits by naive one-to-one mapping. As for the gate-level implementation, TABLE 1 shows that our design reduces the total number of gates from 21 to 10 for the AU and CU parts. Moreover, this reduction becomes more pronounced for *n*-bit GPCA.
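For readers who want to turn the raw counts quoted in this letter into relative figures, the short sketch below may help. It is only an illustration of the arithmetic: the function name `relative_reduction` is a hypothetical helper, and the averaged percentages reported in TABLE 3 are not derived from this snippet. The inputs are the two count pairs stated in our responses (21 to 10 gates for the AU and CU parts, 27 to 5 wire-crossings for the AU part).

```python
def relative_reduction(before, after):
    """Return the percentage reduction when a count drops from `before` to `after`."""
    if before <= 0:
        raise ValueError("`before` must be a positive count")
    return 100.0 * (before - after) / before


# Count pairs quoted in this letter (see TABLE 1 of the revised paper).
gate_reduction = relative_reduction(21, 10)      # total gates, AU and CU parts
crossing_reduction = relative_reduction(27, 5)   # wire-crossings, AU part

print(f"gates: {gate_reduction:.2f}%")
print(f"wire-crossings: {crossing_reduction:.2f}%")
```

Under these inputs the gate count drops by roughly 52% and the wire-crossing count by roughly 81%.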

My suggestion to authors is to also focus on architecture level to show the actual benefit.

**Response:** Many thanks for your positive comments on our proposed design.

6. Other comments for the authors:  
  
**Reviewer: 3**  
**Recommendation:** Minor Revision  
  
**Comments:**(There are no comments. Please check to see if comments were included as a file attachment with this e-mail or as an attachment in your Author Center.)  
  
**Additional Questions:**  
1. Confidence in your review: Medium  
  
2. What is the important insight or idea that this paper contributes?: This paper highlights prior work in design of quantum-dot cellular automata (QCA) including the publication of a generalized pipelined cellular array (GPCA) cell. The authors identify inefficiencies in the circuit design of the arithmetic unit and control unit of the GPCA design and propose a new design that leads to a reduction in area and critical path latency by 39.27% and 35.89% respectively.

3. What is the potential impact of this paper?: This paper could lead to more efficient implementations of QCAs and is a clear advancement over existing state-of-the-art. It is unclear to me the future impact of QCAs in general, but the work is well described and the improvement is impressive.

4. Is the manuscript technically sound?  Please explain your answer in  
the "revisions" or "comments for the authors" sections below.: Yes  
  
5. What revisions do you suggest for the paper?

**Comment 3-1:** It would be nice to better highlight the downsides of prior work. I felt section 2.1 ended abruptly without describing the issues with these gate counts.

**Response:** Thanks for your kind suggestion. We have described the downsides of prior work in detail in section 3.2. Only four MAJ gates are fully utilized (i.e., with no constant inputs), while the remaining 11 MAJ gates are partially utilized (i.e., with one constant input). Further improvement can be achieved if we make better use of MAJ gates [1] or introduce a new XOR3 primitive.
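The distinction between fully and partially utilized MAJ gates can be made concrete with a small sketch. It is an illustration under our own assumptions, not the netlist of [9]: a gate is represented as a tuple of inputs in which signal names are strings and constants are the integers 0 or 1, and `is_fully_utilized` is a hypothetical helper. The checks confirm the standard facts that a MAJ gate with one input tied to 0 degenerates to a two-input AND, and with one input tied to 1 degenerates to a two-input OR.

```python
from itertools import product


def maj(a, b, c):
    """Three-input majority: 1 when at least two inputs are 1."""
    return (a & b) | (a & c) | (b & c)


def is_fully_utilized(gate_inputs):
    """A gate is fully utilized when none of its inputs is a constant 0/1.

    Signals are given as strings and constants as ints (an assumed encoding).
    """
    return not any(isinstance(x, int) for x in gate_inputs)


# With one constant input, MAJ only computes a two-input function.
maj0_is_and = all(maj(a, b, 0) == (a & b) for a, b in product((0, 1), repeat=2))
maj1_is_or = all(maj(a, b, 1) == (a | b) for a, b in product((0, 1), repeat=2))

print(maj0_is_and, maj1_is_or)
print(is_fully_utilized(("a", "b", "c")), is_fully_utilized(("a", "b", 0)))
```

A partially utilized gate therefore occupies a full three-input MAJ cell while delivering only two-input functionality, which is the inefficiency referred to above.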

**Comment 3-2:** A brief discussion/overview of the wire-crossing problem would be nice. I understand at a very high level from VLSI that wire-crossings are difficult but are they \*more\* difficult for QCA design and layout? If so, briefly explain why and give a sense for how painful it is. That will help readers understand how important your contribution is.

**Response:** Thanks for your kind suggestion. We have added a brief introduction to the wire-crossing problem in the second paragraph of section 3.1. Wire-crossings may lead to many difficulties, including crosstalk and additional power dissipation [9]. There are two main kinds of QCA wire-crossing, as QCA can be implemented using either a single-layer or a multi-layer strategy. One single-layer approach handles wire-crossings with two different quantum-dot orientations, i.e., one rotated 45 degrees relative to the other. Another single-layer approach is based on clock zones and uses phase differences to represent crossing wires [11]. In contrast, the multi-layer method is more flexible, as QCA wires can be routed in three dimensions. However, single-layer QCA is more practical for fabrication.

**Comment 3-3:** It’s difficult in text to understand the trade-offs between different design choices. For example, following all of the changes in various gate counts of page 2 is tedious for the reader. Putting these numbers into a table or using relative numbers when comparing all design choices of each expression would be helpful.

**Response:** Thanks for your kind suggestion. We have added TABLE 1, which compares all designs.

**Comment 3-4:** It would also be nice to have some sort of progressive information to show a “path” to your final design through the various design points you evaluated.

**Response:** Thanks for your kind suggestion. We have reorganized the paper following your suggestions so that the steps leading to our final design are presented in order. The progression toward the final design should now be easier to follow.

6. Other comments for the authors:

**Comment 3-5:** There is plenty of white space and text for you to compress to add more to the paper. For example. The right column of the first page is very sparse, and the text below section 2 could be completely deleted if that space is useful to add more figures or explanatory text.

**Response:** Thanks for your valuable comments. We have added more details to our paper and have filled the white space.

**Comment 3-6:** Figure 5 is very large and for me did not add much value to the paper. Is it to show that there are few wire crossings? Is there any way to shrink it without making it unreadable? Or maybe consider removing it? (although it is a pretty figure!)

**Response:** Thanks for your kind suggestion. We have deleted Figure 5 and added more details to make the paper clearer and easier to understand.

**Comment 3-7:** Instead of using a citation as a noun (i.e. “We compare against [9]” or “[9] used 4 gates”) you should really use the proper name for the technique and then cite it. I think GPCA[9] would be clearer. This makes Table 1 especially difficult to parse on first read.

**Response:** Thanks for your kind suggestion. We have replaced [9] ([3] in the revised version) with GPCA[9] (GPCA[3] in the revised version).